Abstract

Three methods of augmenting computer networks by adding at most 
one link per processor are discussed. 

1. A tree of N nodes may be augmented such that the resulting
graph has diameter no greater than $4\lceil \log_2((N+2)/3) \rceil - 2$.
This $O(N^3)$ algorithm can be applied to any spanning tree of a
connected graph to reduce the diameter of that graph to O(log N).

2. Given a binary tree T and a chain C of N nodes each,
C may be augmented to produce C' so that T is a subgraph of C'.
This algorithm is O(N) and may be used to produce augmented chains
or rings that have diameter no greater than
$2\lceil \log_2((N+2)/3) \rceil$ and are planar.

3. Any rectangular two-dimensional 4 (8) nearest neighbor array
of size $N = 2^k$ may be augmented so that it can emulate a single
stage shuffle-exchange network of size N/2 in 3 (2) time steps.


I. INTRODUCTION


We show how the capabilities of an existing computer network can 
be improved by adding at most one communication link per processor. 
In particular, we show how at most one edge per node need be added to
an arbitrary N node connected graph in order to reduce its diameter
to $4\lceil \log_2((N+2)/3) \rceil - 2$. Since the diameter of a
network determines the maximum time required to communicate between a
pair of nodes, this result directly reduces the worst-case
communication delay of a network. This is a generalization of a
previously reported algorithm that was applicable only to special
types of graphs [4]. The cost of this improvement is at most one I/O
port per processor and thus no more than N/2 additional communication
links. This is discussed in Section III of this paper.

In Section IV we describe how a chain or ring of processors can be 
augmented by adding at most one edge per node so that a given binary 
tree may be perfectly mapped on it. This algorithm has complexity
O(N) and generates graphs that are planar. It is thus a significant
improvement over the previously reported algorithm [4], which was
O(N ) and did not guarantee planarity of the augmented graph. This
allows us to reduce the diameter of a ring from N/2 to
$2\lceil \log_2((N+2)/3) \rceil$, thereby speeding up the execution of
those algorithms that require global data operations, such as sorting.
This algorithm can be applied to any graph that is Hamiltonian and can
therefore reduce the diameter of a k-dimensional nearest-neighbor
array from $O(N^{1/k})$ to O(log N). For the case of two-dimensional
arrays the extra edges require only one additional layer of
interconnect. These results permit us to construct array processors
that combine all the advantages of nearest-neighbor arrays as well as
those of tree machines.

Section V describes how a 4 (8) nearest neighbor array may be
augmented so that it can execute the shuffle-exchange permutation [12]
in 3 (2) time steps. This allows us to combine the benefits of
nearest neighbor arrays and permutation networks in one machine.

Section VI contains a discussion of our results. We start with a 
section on definitions. 


II. DEFINITIONS 

We will consider only undirected connected graphs in what follows
[5]. A graph is denoted $G = \langle V, E \rangle$, where V is the set
of nodes and E the set of edges. The distance d(x,y) between any two
nodes x, y contained in V is the length of the shortest path joining
x and y. The diameter of a graph G is the maximum distance between
any two nodes in the graph. That is,

    $\mathrm{diameter} = \max_{x,y \in V} d(x,y)$.
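
The definition of diameter can be checked directly on small graphs. The following is a minimal sketch, assuming the graph is given as a dictionary `adj` that maps each node to a list of its neighbors; the names `adj`, `distances_from` and `diameter` are illustrative and do not come from the paper.

```python
from collections import deque


def distances_from(adj, source):
    """Breadth-first search: d(source, v) for every node v reachable from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def diameter(adj):
    """Maximum of d(x, y) over all pairs of nodes of a connected graph."""
    return max(max(distances_from(adj, s).values()) for s in adj)
```

For a ring of 6 nodes this returns 3, and for a chain of 5 nodes it returns 4, in agreement with the N/2 and N-1 diameters of rings and chains.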

A tree is a connected undirected acyclic graph with N nodes and
N-1 edges. A rooted tree is a tree with an explicitly designated
root node. Each edge of a rooted tree connects a father node to a son
node, where the father occurs before the son in the path connecting
the root to the son. All nodes without descendants are called leaf
nodes.

A chain is a tree whose degree is constrained to 2. It has
exactly two leaves and all of its non-leaf nodes have degree exactly
2.


A binary tree is a rooted tree in which the maximum number of
sons of any node is 2. The two sons of a node (if they exist) are
called leftson and rightson.

A Moore 3-tree is a rooted tree in which all non-leaf nodes have
degree exactly 3. The root thus has 3 sons and all other non-leaf
nodes have 2 sons each.

The height h of a rooted tree is the maximum distance from the
root node to any leaf node. The diameter of a rooted tree cannot
exceed 2h or be less than h.

In a complete binary tree of N nodes the distances from the root
node to any two leaf nodes can differ by at most 1 and there can be no
more than one non-leaf node with only one son (Fig. 1). The height of
this tree is $h = \lceil \log_2(N+1) \rceil - 1$ and its diameter is
no more than 2h.

Similarly, in a complete Moore 3-tree of N nodes, the distances
from the root to any two leaf nodes can differ by at most one and
there can be no more than one non-leaf node with only one son (Fig.
2). The height of such a tree is $h = \lceil \log_2((N+2)/3) \rceil$
and its diameter is no more than 2h.
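
The two height formulas can be evaluated with integer arithmetic only. The sketch below is illustrative: the function names are ours, and the Moore 3-tree formula $\lceil \log_2((N+2)/3) \rceil$ is the one that follows from counting nodes (a full Moore 3-tree of height h has $3 \cdot 2^h - 2$ nodes).

```python
def ceil_log2(n):
    """Smallest integer e >= 0 with 2**e >= n, for an integer n >= 1."""
    return (n - 1).bit_length()


def binary_height(n_nodes):
    """Height of a complete binary tree: ceil(log2(N + 1)) - 1."""
    return ceil_log2(n_nodes + 1) - 1


def moore3_height(n_nodes):
    """Height of a complete Moore 3-tree: ceil(log2((N + 2) / 3)).

    Because 2**e is an integer, 2**e >= (N + 2) / 3 holds exactly when
    2**e >= ceil((N + 2) / 3), so the fraction can be rounded up first.
    """
    return ceil_log2(-(-(n_nodes + 2) // 3))
```

For example, `binary_height(25)` is 4 and `moore3_height(22)` is 3; the diameter bound 2h is then 8 and 6 respectively.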


Of all trees with N nodes and degree constraint 3, a complete
Moore 3-tree has minimum diameter.

A graph $G_1$ is said to have been mapped onto another graph $G_2$
if: 1) $G_1$ and $G_2$ have the same number of nodes and 2) each node
of $G_1$ has been assigned to a node of $G_2$ in a one-to-one onto
fashion. If every edge of $G_1$ falls on some edge of $G_2$ then the
mapping is called a perfect mapping. In any case, the number of edges
of $G_1$ that fall on edges of $G_2$ is called the cardinality of the
mapping [2].
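
The cardinality of a mapping is simple to compute once the assignment is known. The sketch below assumes each graph is given as a list of undirected edges and the mapping as a dictionary `assign` from nodes of $G_1$ to nodes of $G_2$; these representations and names are our own choice for illustration.

```python
def cardinality(edges_g1, edges_g2, assign):
    """Number of edges of G1 whose endpoints land on an edge of G2."""
    target = {frozenset(e) for e in edges_g2}  # undirected: ignore endpoint order
    return sum(
        1 for u, v in edges_g1
        if frozenset((assign[u], assign[v])) in target
    )


def is_perfect(edges_g1, edges_g2, assign):
    """A mapping is perfect when every edge of G1 falls on an edge of G2."""
    return cardinality(edges_g1, edges_g2, assign) == len(edges_g1)
```

Mapping a 4 node star with center a onto the chain 0-1-2-3 by a to 1, b to 0, c to 2, d to 3 gives cardinality 2: the edge (a, d) does not fall on a chain edge, so the mapping is not perfect.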

When $G_1$ is a tree, the perfect mapping of $G_1$ onto $G_2$ is the
same as the spanning of $G_2$ by $G_1$. $G_1$ is then a spanning tree
of $G_2$ in the conventional sense [5].


III. REDUCING THE DIAMETER OF A NETWORK 

Given a tree T of N nodes, we will show how to augment it by
adding no more than one edge per node so that the resulting graph has
diameter no greater than $4\lceil \log_2((N+2)/3) \rceil - 2$. This
algorithm can be applied to any spanning tree of an arbitrary
connected graph of N nodes to obtain O(log N) diameter.

Let T be the given tree of N nodes. Construct a complete
Moore 3-tree M of N nodes. Let U be a set of trees that
initially contains T.

 1.  done := false;
 2.  while not empty(U) and not done do
     begin
 3.     select the tree u in U that has the
        maximum diameter d;
 4.     remove the chain of nodes c from u
        which lies along this diameter;
 5.     return u-c to U;
 6.     remove the longest chain m from M,
        let its length be p;
 7.     if d < p then done := true
        else
        begin
 8.        place nodes 1 to p of c on nodes
           1 to p of m;
 9.        return the chain formed by nodes
           p+1 to d to U;
        end
     end
10.  reconnect all edges removed in step 4;
11.  add edges between all non-adjacent nodes of T
     that were mapped on adjacent nodes of M in step 8;
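
Steps 3 and 4 need the chain of nodes that lies along a diameter of a tree. The sketch below finds such a chain with two breadth-first searches, a standard technique for trees that we substitute here for the all-pairs distance computation the running time analysis assumes. The adjacency-dictionary representation and the function names are ours.

```python
from collections import deque


def farthest(adj, start):
    """BFS from start; return a node at maximum distance and the parent map."""
    parent = {start: None}
    queue = deque([start])
    last = start
    while queue:
        last = queue.popleft()  # BFS dequeues nodes in nondecreasing distance
        for v in adj[last]:
            if v not in parent:
                parent[v] = last
                queue.append(v)
    return last, parent


def diameter_chain(adj, start):
    """Nodes along one diameter of the tree that contains start."""
    a, _ = farthest(adj, start)   # a is one end of some diameter
    b, parent = farthest(adj, a)  # b is the other end
    chain = []
    while b is not None:          # walk back from b to a
        chain.append(b)
        b = parent[b]
    return chain
```

The length of the returned list is the number of nodes on the chain; the diameter in edges is one less. This double search is valid only for trees, which is all that steps 3 and 4 require.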


This algorithm attempts to map successively smaller chains from
T onto successively smaller chains of M. If this algorithm terminates
without the condition of line 7 being satisfied, then all of T will
have been mapped on all of M with at most one edge added to each
node of T in step 11. In this case the diameter of the augmented
graph will be no greater than that of M, that is
$2\lceil \log_2((N+2)/3) \rceil$. This will happen, among other cases,
if T is a chain, when our algorithm reduces to an inverse of the
algorithm described in [4] for mapping degree constrained trees onto
chains.

The interesting case is when the condition of line 7 is
satisfied and the diameter of the longest remaining chain in T is
less than the length of the longest chain in M. Now a part of T,
say $T_m$, has been mapped onto a part of M and, after adding edges,
will become T' with diameter no greater than
$2\lceil \log_2((N+2)/3) \rceil$. No component in the remaining
portion will have diameter greater than the longest remaining chain
in M. This is maximum when the condition is satisfied on the second
pass through the while loop. (If this is true during the first pass,
then the diameter of T is less than the diameter of M and there is no
point in running this algorithm.) The diameter of the longest
remaining component $C_m$ in $T - T_m$ is thus at most
$2\lceil \log_2((N+2)/3) - 1 \rceil - 1$.

The overall diameter of the augmented graph cannot exceed
$\mathrm{dia}(C_m) + \mathrm{dia}(T') + 1$. This is proved by
contradiction. Suppose the diameter exceeds this amount. Then there
exists some component of $T - T_m$, say $C_n$, such that the diameters
of $C_m$, T' and $C_n$ lie along a chain that has length greater than
$\mathrm{dia}(C_m) + \mathrm{dia}(T')$ (Fig. 3). This is impossible
because when mapping nodes from chains of T onto chains of M we
always start at one end of the chain. Thus when extracting the last
successfully mapped chain from T, we would have started at an extreme
end of $C_m$ or $C_n$ and would have been left with a chain of length
$\mathrm{dia}(C_n) + \mathrm{dia}(C_m)$, thereby contradicting our
assumption that the condition of line 7 is satisfied. The overall
diameter of the augmented graph is thus no greater than

    $4\lceil \log_2((N+2)/3) \rceil - 2$.
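
As a quick check of the arithmetic, take N = 22, a value chosen here only for illustration because it is the size of a full Moore 3-tree of height 3:

```latex
\left\lceil \log_2 \frac{N+2}{3} \right\rceil
  = \left\lceil \log_2 \frac{24}{3} \right\rceil = 3,
\qquad
4 \left\lceil \log_2 \frac{N+2}{3} \right\rceil - 2 = 4 \cdot 3 - 2 = 10 .
```

By comparison, an unaugmented tree of 22 nodes can have diameter as large as 21, which occurs when the tree is a chain.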

The running time of this algorithm is $O(N^3)$ since it requires
the calculation of distances between all pairs of nodes (an $O(N^2)$
process) as many as N times.


IV. MAPPING BINARY TREES ONTO CHAINS

We now describe an algorithm that, given an arbitrary binary tree 
T and a chain C (each of N nodes) will specify the mapping of T 
onto C such that no more than one edge per node need be added to C 
to produce an augmented chain C' which has T as a spanning tree. 
In other words, T can be perfectly mapped on C'. This algorithm has
complexity O(N) and the mappings that it produces are such that
C' is always planar.

This algorithm starts at the root of the tree and proceeds by 
threading each node and its sons in a linked list. When this 
algorithm concludes, all nodes of the tree have been threaded by this 
linked list which thus specifies the order in which the nodes of T 
are to be mapped on nodes of C. 

The given binary tree is assumed to be stored in an array with 
each node having a pointer to its leftson, rightson and a back pointer 
to its father. Two more pointers are required for the linked list. 
Finally, each node has a label which specifies the order in which it 
is threaded. This is a crucial notion in the development of our
algorithm. Given a node that has already been threaded by the list,
the two sons of the node can be added to the list in three possible
ways: inorder, postorder and preorder, which correspond to the well
known methods of tree traversal [8]. This operation is illustrated in
Figures 4 and 5. The way in which a node x is labelled depends on
how the father of x was linked and whether x is a leftson or a
rightson of its father. A call to procedure link will thread the sons
of a node. Thus link(x, preorder) corresponds to the transition
from Figure 4 to Figure 5(b). Note that the father is labelled with
the order in which it is linked to its sons.

Procedure link is not recursive; however, procedure tree_link
(described below), which calls link, is. The complete tree threading
algorithm is given below. The function father_linked(x) returns
label[father[x]], that is, the order in which the father of x was
threaded.

Figures 6(a), (b) and (c) show successive stages in the threading
of a tree and Figure 6(d) shows the fully threaded tree. Figure 7(a)
shows how a 25 node complete binary tree is threaded. The resultant
augmented chain is shown in Figure 7(b).

procedure tree_link(x: range);
begin
   if x = nil then {return}
   else
   begin
      if x is the leftson of its father then
         case father_linked(x) of
            inorder:   link(x, postorder);
            postorder: link(x, inorder);
            preorder:  link(x, preorder);
         end
      else {x is the rightson of its father}
         case father_linked(x) of
            inorder:   link(x, preorder);
            postorder: link(x, postorder);
            preorder:  link(x, inorder);
         end;
      tree_link(leftson[x]);
      tree_link(rightson[x]);
   end;
end;

{main program}
begin
   start := root;
   link(root, inorder);
   tree_link(leftson[root]);
   tree_link(rightson[root]);
end
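
The threading procedure can be sketched in runnable form as follows. This is an illustration under stated assumptions, not the original implementation: we assume that link(x, order) inserts the sons of x immediately beside x in the linked list (left, x, right for inorder; x, left, right for preorder; left, right, x for postorder), which is our reading of Figures 4 and 5. The dictionary representation, the explicit stack in place of recursion, and all names are ours.

```python
INORDER, PREORDER, POSTORDER = "inorder", "preorder", "postorder"

# (x is a leftson?, label of x's father) -> order in which x links its own sons
RULE = {
    (True, INORDER): POSTORDER, (True, POSTORDER): INORDER,
    (True, PREORDER): PREORDER, (False, INORDER): PREORDER,
    (False, POSTORDER): POSTORDER, (False, PREORDER): INORDER,
}


def thread(left, right, root):
    """Return the tree's nodes in chain order; left/right map node -> son."""
    prev, nxt, label = {root: None}, {root: None}, {}

    def insert_after(a, b):          # place b immediately after a
        c = nxt[a]
        prev[b], nxt[b], nxt[a] = a, c, b
        if c is not None:
            prev[c] = b

    def insert_before(a, b):         # place b immediately before a
        c = prev[a]
        prev[b], nxt[b], prev[a] = c, a, b
        if c is not None:
            nxt[c] = b

    def link(x, order):              # assumed behavior of procedure link
        label[x] = order
        l, r = left.get(x), right.get(x)
        if order == INORDER:         # l, x, r
            if l is not None: insert_before(x, l)
            if r is not None: insert_after(x, r)
        elif order == PREORDER:      # x, l, r
            if r is not None: insert_after(x, r)
            if l is not None: insert_after(x, l)
        else:                        # POSTORDER: l, r, x
            if l is not None: insert_before(x, l)
            if r is not None: insert_before(x, r)

    link(root, INORDER)
    stack = [(right.get(root), False, root), (left.get(root), True, root)]
    while stack:                     # left subtree is processed first
        x, is_left, father = stack.pop()
        if x is None:
            continue
        link(x, RULE[(is_left, label[father])])
        stack.append((right.get(x), False, x))
        stack.append((left.get(x), True, x))

    head = root
    while prev[head] is not None:    # walk back to the start of the list
        head = prev[head]
    order = []
    while head is not None:
        order.append(head)
        head = nxt[head]
    return order
```

For the 7 node complete binary tree with root 1 and sons 2i and 2i+1, this yields the chain order 4, 5, 2, 1, 3, 6, 7. The tree edges (2,4) and (3,7) are the only ones that do not coincide with chain edges, so each node receives at most one added edge.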


V. AUGMENTING NEAREST NEIGHBOR ARRAYS TO ALLOW SHUFFLE-EXCHANGES

Figure 8 shows a single stage recirculating shuffle-exchange 
network of size N = 8 [12]. This consists of shuffle 
interconnections and N/2 2x2 switches that route the shuffled 
outputs back to the inputs after each shuffle operation. The switches 
can route their inputs to their outputs in either straight through or 
interchange fashion. 
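
One pass through such a network can be sketched as follows. We assume the usual definition of the perfect shuffle, in which the item at position i moves to the position obtained by rotating the k-bit binary representation of i left by one bit, and we assume switch j serves positions 2j and 2j+1. The function names and the `swap` parameter are ours.

```python
def shuffle(i, k):
    """Perfect shuffle of index i among 2**k positions: rotate k bits left by one."""
    n = 1 << k
    return ((i << 1) | (i >> (k - 1))) & (n - 1)


def shuffle_exchange(data, swap):
    """Shuffle data, then let switch j interchange its pair when swap[j] is true.

    len(data) must be a power of two and len(swap) must be len(data) // 2.
    """
    n = len(data)
    k = n.bit_length() - 1
    out = [None] * n
    for i, item in enumerate(data):
        out[shuffle(i, k)] = item
    for j, interchange in enumerate(swap):
        if interchange:
            out[2 * j], out[2 * j + 1] = out[2 * j + 1], out[2 * j]
    return out
```

With all switches set straight through, `shuffle_exchange(list(range(8)), [False] * 4)` interleaves the two halves of the input and returns `[0, 4, 1, 5, 2, 6, 3, 7]`.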

A single stage recirculating shuffle-exchange network of size
N/2 can be formed out of any rectangular 4 (8) nearest neighbor array
of size $N = 2^k$ (k > 2) by adding at most one more link per
processor. The added links are needed for the shuffle
interconnections. Switch functions can be emulated by the existing
array interconnections.

We index the columns of a rectangular 4 or 8 nearest neighbor
array of size $N = 2^k$ with the numbers 0, 1, 2, ... and partition
the array into two halves consisting of even and odd-numbered columns.
Shuffle interconnections are added between the processors of the two
halves. Each group of four processors which straddles even and
odd-numbered columns can emulate any switch function in no more than
2 (1) time steps for 4 (8) nearest neighbor arrays. Thus the complete
augmented array can emulate a shuffle-exchange operation in 3 (2) time
steps, respectively. Figures 8 and 9 illustrate this process.





VI. DISCUSSION 

The network augmenting algorithm described in Section III is
capable of reducing the diameter of any connected graph to O(log N)
by adding at most one edge per node. This algorithm will be useful
for improving the performance of an existing computer network whose
interconnection structure has evolved over time without regard for
diameter. By reducing the diameter we reduce the maximum
communication latency of the network. Our augmentation scheme
requires at most one additional communication port per node and thus
no more than N/2 additional links. This algorithm is applicable to
any connected graph and is thus more general than the algorithms
described in [4] which are applicable only to Hamiltonian graphs or
graphs partitionable into cliques of size 3 or greater.

The algorithm of Section IV for mapping binary trees onto rings
or chains allows us to solve efficiently binary tree structured
problems on one-dimensional nearest-neighbor arrays. This algorithm
indirectly solves the mapping problem [2] for the special case of
binary trees onto rings or chains. This allows us to maximize usage
of nearest neighbor links in a chain that has a global bus
superimposed on it (a problem similar to that discussed in [3]). This
algorithm may also be used to solve approximately the problem of
mapping binary trees onto Hamiltonian graphs with cardinality
guaranteed at no worse than 2/3 of optimal.

This algorithm may also be used to augment a chain or ring so
that it contains a complete binary tree and thus suggests new
interconnection structures that have all the advantages of binary
trees as well as those of rings. With a minor modification, it can be
used to map a complete Moore 3-tree onto a chain or ring and thus
obtain diameter $2\lceil \log_2((N+2)/3) \rceil$. This algorithm
yields augmented chains or rings that are "one-sided" planar (Figure
7(b)) in O(N) time and is thus superior to the algorithm given in [4]
which is O(N ) and does not guarantee planarity.

Since our augmented rings require the addition of only one edge 
per node and have logarithmic diameter, they are superior to the 
chordal rings of Arden and Lee [1] which have square root diameter and 
to the augmented rings proposed by Pradhan and Reddy [10] which have 
logarithmic diameter but require the addition of two edges per node. 

The algorithm of Section IV can be applied to any Hamiltonian
graph. Although the problem of finding Hamiltonian paths in graphs is
intractable in general [6], it is trivial for most nearest neighbor
arrays. Many array processors have nearest-neighbor
interconnections. Examples include the Illiac-IV, the Finite Element
Machine [9], and PACS [7]. The n x n nearest neighbor array lends
itself to the efficient solution of many interesting problems [11],
[13] but has the disadvantage of an O(n) diameter which results in
poor execution of global operations such as sorting or finding the
maximum. We can use our algorithm to augment such arrays to obtain
networks with all the advantages of nearest neighbor arrays as well as
those of tree machines. It is interesting to note that only one
additional layer of interconnecting wires is required for this
purpose.




Finally, we showed in Section V how the powerful perfect-shuffle 
interconnection can be superimposed on a two-dimensional nearest 
neighbor array. This gives us an interconnection pattern with all the 
advantages of nearest neighbor arrays as well as those of the perfect 
shuffle. 


Figure 4. The sons of node x have not been threaded.

Figure 5. Threading the sons of node x in: (a) inorder; (b) preorder;
and (c) postorder.

Figure 6. (a), (b), (c): Successive stages in the threading of a
tree. (d): The fully threaded tree.

Figure 7. (a) Threading a 25 node complete binary tree. (b) The
resultant augmented chain.

Figure 8. Single stage recirculating shuffle-exchange network of
size N = 8.

Figure 9. Shuffle connection augmented 4x4 4-nearest neighbor array
that can emulate the network of Figure 8 in constant time.

Figure 10. An augmented 4x8 8-nearest neighbor array that can
emulate a 16 node shuffle.